Achieving Adequacy of Description of Multiword Entities in Semantically-Oriented Computational Lexicons
Abstract
This article discusses three aspects of recording multiword expressions (MWEs) in semantically oriented lexicons for NLP: achieving syntactic adequacy, achieving semantic adequacy, and computing the semantic contribution of non-compositional elements. The purpose of the analysis is twofold: first, to provide a descriptive, example-based account of how complex aspects of MWEs can be treated in computational lexicons; second, to bring attention to some aspects of MWEs that are not currently being treated by most systems but must be treated if we are to achieve truly sophisticated natural language processing.
Similar resources
Bootstrapping Semantic Lexicons for Technical Domains
We address the task of bootstrapping a semantic lexicon from a list of seed terms and a large corpus. By restricting to a small subset of semantically strong patterns, i.e., coordinations, we improve results significantly. We show that the restriction to coordinations has several additional benefits, such as improved extraction of multiword expressions, and the possibility to scale up previous ...
Constructing Bilingual Multiword Lexicons for a Resource-Poor Language Pair
This paper presents a method for constructing bilingual multiword lexicons for a resource-poor language pair such as Korean–French. To do so, we first identify multiword candidates from parallel corpora, and then use the pivot context approach [1] to align those candidates. Our empirical study shows encouraging results (e.g., accuracy), even though this study is ongoing.
Towards Best Practice for Multiword Expressions in Computational Lexicons
The importance and role of multi-word expressions (MWE) in the description and processing of natural language has been long recognized. However, multi-word information has often been relegated to the marginal role of idiosyncratic lexical information. The need for MWE lexicons grows even more acute for multi-lingual applications, for which (sometimes complex) correspondences must be identified,...
Integration of Reduplicated Multiword Expressions and Named Entities in a Phrase Based Statistical Machine Translation System
The language specific Multiword expressions (MWEs) play important roles in many natural language processing (NLP) tasks. Integrating reduplicated multiword expressions (RMWEs) into the Phrase Based Statistical Machine Translation (PBSMT) to improve translation quality is reported in the present work between Manipuri, a highly agglutinative Tibeto-Burman language and English. In addition, Multiw...
An Experiment of Lexical-Semantic Tagging of an Italian Corpus
The availability of semantically tagged corpora is becoming a very important and urgent need for training and evaluation within a large number of applications but also they are the natural application and accompaniment of semantic lexicons of which they constitute both a useful testbed to evaluate their adequacy and a repository of corpus examples for the attested senses. It is therefore essent...